3 <110> APPLICANT: CALIFORNIA INSTITUTE OF TECHNOLOGY 

4 Debe, Derek A. 

6 <120> TITLE OF INVENTION: METHOD FOR DETERMINING THREE-DIMENSIONAL PROTEIN STRUCTURE FROM PRIMARY 

7 PROTEIN SEQUENCE 

9 <130> FILE REFERENCE: 265/297 
11 <140> CURRENT APPLICATION NUMBER: US 09/905,176 
C--> 12 <141> CURRENT FILING DATE: 2002-04-05 

14 <150> PRIOR APPLICATION NUMBER: US 60/218,016 

15 <151> PRIOR FILING DATE: 2000-07-12 
17 <160> NUMBER OF SEQ ID NOS : 26 

19 <170> SOFTWARE: PatentIn version 3.1 

21 <210> SEQ ID NO: 1 

22 <211> LENGTH: 53 

23 <212> TYPE: PRT 

24 <213> ORGANISM: Artificial Sequence 

26 <220> FEATURE: 

27 <223> OTHER INFORMATION: Genus/species, Unknown 
29 <400> SEQUENCE: 1 

31 Leu Val Ala Phe Ala Asp Phe Gly Ser Val Thr Phe Thr Asn Ala Glu 

32 1 5 10 15 

35 Ala Thr Ser Gly Gly Ser Thr Val Gly Pro Ser Asp Ala Thr Val Met 
36 20 25 30 

39 Asp Ile Glu Gln Asp Gly Ser Val Leu Thr Glu Thr Ser Val Ser Gly 
40 35 40 45 

43 Asp Ser Val Thr Val 
44 50 

47 <210> SEQ ID NO: 2 

48 <211> LENGTH: 53 

49 <212> TYPE: PRT 

50 <213> ORGANISM: Artificial Sequence 

52 <220> FEATURE: 

53 <223> OTHER INFORMATION: Genus/species, Unknown 

55 <400> SEQUENCE: 2 

57 Leu Val Ala Phe Ala Asp Phe Gly Ser Val Thr Phe Thr Asn Ala Glu 

58 1 5 10 15 

61 Ala Thr Ser Gly Gly Ser Thr Val Gly Pro Ser Asp Ala Thr Val Met 

62 20 25 30 

65 Asp Ile Glu Gln Asp Gly Ser Val Leu Thr Glu Thr Ser Val Ser Gly 

66 35 40 45 

69 Asp Ser Val Thr Val 

70 50 

73 <210> SEQ ID NO: 3 

74 <211> LENGTH: 53 

75 <212> TYPE: PRT 

76 <213> ORGANISM: Artificial Sequence 

78 <220> FEATURE: 

79 <223> OTHER INFORMATION: Genus/species, Unknown 
81 <400> SEQUENCE: 3 

83 Leu Val Pro Phe Ala Asn Phe Gly Thr Val Thr Phe Thr Gly Ala Glu 
84 1 5 10 15 

87 Ala Thr Thr Ser Ser Gly Thr Val Thr Ala Ala Asp Ala Thr Leu Ile 

88 20 25 30 

91 Asp Ile Glu Gln Asn Gly Glu Val Leu Thr Ser Val Thr Val Ser Gly 

92 35 40 45 

95 Ser Thr Val Thr Val 

96 50 

99 <210> SEQ ID NO: 4 

100 <211> LENGTH: 52 

101 <212> TYPE: PRT 

102 <213> ORGANISM: Artificial Sequence 

104 <220> FEATURE: 

105 <223> OTHER INFORMATION: Genus/species, Unknown 
107 <400> SEQUENCE: 4 

109 Leu Val Gln Phe Ala Asn Phe Gly Thr Val Thr Phe Thr Gly Ala Ser 

110 1 5 10 15 

113 Ala Thr Gln Asn Gly Glu Ser Val Gly Val Thr Gly Ala Gln Ile Ile 

114 20 25 30 

117 Asp Leu Gln Gln Asn Ser Val Leu Thr Ser Val Ser Thr Ser Ser Asn 

118 35 40 45 

121 Ser Val Thr Val 

122 50 

125 <210> SEQ ID NO: 5 

126 <211> LENGTH: 47 

127 <212> TYPE: PRT 

128 <213> ORGANISM: Artificial Sequence 

130 <220> FEATURE: 

131 <223> OTHER INFORMATION: Genus/species, Unknown 
133 <400> SEQUENCE: 5 

135 Leu Val Asn Phe Ala Asp Phe Asp Thr Val Thr Phe Lys Asp Cys Ser 

136 1 5 10 15 

139 Pro Ser Val Ser Gly Ser Thr Ile Val Asp Ile Arg Gln Ser Leu Glu 

140 20 25 30 

143 Val Leu Thr Glu Cys Ser Thr Thr Gly Thr Thr Thr Val Thr Cys 
144 35 40 45 

147 <210> SEQ ID NO: 6 

148 <211> LENGTH: 54 

149 <212> TYPE: PRT 

150 <213> ORGANISM: Artificial Sequence 

152 <220> FEATURE: 

153 <223> OTHER INFORMATION: Genus/species, Unknown 
155 <400> SEQUENCE: 6 

157 Phe Val Pro Phe Ala Ser Phe Ser Pro Ala Val Glu Phe Thr Asp Cys 
158 1 5 10 15 

161 Ser Val Thr Ser Asp Gly Glu Ser Val Ser Leu Asp Asp Ala Gln Ile 

162 20 25 30 

165 Thr Gln Val Ile Ile Asn Asn Gln Asp Val Thr Asp Cys Ser Val Ser 

166 35 40 45 

169 Gly Thr Thr Val Ser Cys 

170 50 

173 <210> SEQ ID NO: 7 

174 <211> LENGTH: 54 

175 <212> TYPE: PRT 

176 <213> ORGANISM: Artificial Sequence 

178 <220> FEATURE: 

179 <223> OTHER INFORMATION: Genus/species, Unknown 
181 <400> SEQUENCE: 7 

183 Phe Val Pro Phe Ala Ser Phe Ser Pro Ala Val Glu Phe Thr Asp Cys 

184 1 5 10 15 

187 Ser Val Thr Ser Asp Gly Glu Ser Val Ser Leu Asp Asp Ala Gln Ile 

188 20 25 30 

191 Thr Gln Val Ile Ile Asn Asn Gln Asp Val Thr Asp Cys Ser Val Ser 

192 35 40 45 

195 Gly Thr Thr Val Ser Cys 

196 50 

199 <210> SEQ ID NO: 8 

200 <211> LENGTH: 54 

201 <212> TYPE: PRT 

202 <213> ORGANISM: Pseudomonas aeruginosa 
204 <400> SEQUENCE: 8 

206 Phe Val Pro Phe Ala Ser Phe Ser Pro Ala Val Glu Phe Thr Asp Cys 

207 1 5 10 15 

210 Ser Val Thr Ser Asp Gly Glu Ser Val Ser Leu Asp Asp Ala Gln Ile 

211 20 25 30 

214 Thr Gln Val Ile Ile Asn Asn Gln Asp Val Thr Asp Cys Ser Val Ser 

215 35 40 45 

218 Gly Thr Thr Val Ser Cys 

219 50 

222 <210> SEQ ID NO: 9 

223 <211> LENGTH: 326 

224 <212> TYPE: PRT 

225 <213> ORGANISM: Saccharomyces cerevisiae 



227 <400> SEQUENCE: 9 

229 Tyr Pro Tyr Thr Arg Leu Arg Arg Asn Arg Arg Asp Asp Phe Ser Arg 

230 1 5 10 15 

233 Arg Leu Val Arg Glu Asn Val Leu Thr Val Asp Asp Leu Ile Leu Pro 

234 20 25 30 

237 Val Phe Val Leu Asp Gly Val Asn Gln Arg Glu Ser Ile Pro Ser Met 

238 35 40 45 

241 Pro Gly Val Glu Arg Leu Ser Ile Asp Gln Leu Leu Ile Glu Ala Glu 

242 50 55 60 

245 Glu Trp Val Ala Leu Gly Ile Pro Ala Leu Ala Leu Phe Pro Val Thr 

246 65 70 75 80 

249 Pro Val Glu Lys Lys Ser Leu Asp Ala Ala Glu Ala Tyr Asn Pro Glu 

250 85 90 95 

253 Gly Ile Ala Gln Arg Ala Thr Arg Ala Leu Arg Glu Arg Phe Pro Glu 

254 100 105 110 

257 Leu Gly Ile Ile Thr Asp Val Ala Leu Asp Pro Phe Thr Thr His Gly 

258 115 120 125 

261 Gln Asp Gly Ile Leu Asp Asp Asp Gly Tyr Val Leu Asn Asp Val Ser 

262 130 135 140 

265 Ile Asp Val Leu Val Arg Gln Ala Leu Ser His Ala Glu Ala Gly Ala 

266 145 150 155 160 

269 Gln Val Val Ala Pro Ser Asp Met Met Asp Gly Arg Ile Gly Ala Ile 

270 165 170 175 

273 Glu Ala Leu Glu Ser Ala Gly His Thr Asn Val Arg Ile Met Ala 

274 180 185 190 

277 Tyr Ser Ala Lys Tyr Ala Ser Ala Tyr Tyr Gly Pro Phe Arg Asp Ala 

278 195 200 205 

281 Val Gly Ser Ala Ser Asn Leu Gly Lys Gly Asn Lys Ala Thr Tyr Gln 

282 210 215 220 

285 Met Asp Pro Ala Asn Ser Asp Glu Ala Leu His Glu Val Ala Ala Asp 

286 225 230 235 240 

289 Leu Ala Glu Gly Ala Asp Met Val Met Val Lys Pro Gly Met Pro Tyr 

290 245 250 255 

293 Leu Asp Ile Val Arg Arg Val Lys Asp Glu Phe Arg Ala Pro Thr Phe 

294 260 265 270 

297 Val Tyr Gln Val Ser Gly Glu Tyr Ala Met His Met Gly Ala Ile Gln 

298 275 280 285 

301 Asn Gly Trp Leu Ala Glu Ser Val Ile Leu Glu Ser Leu Thr Ala Phe 

302 290 295 300 

305 Lys Arg Ala Gly Ala Asp Gly Ile Leu Thr Tyr Phe Ala Lys Gln Ala 

306 305 310 315 320 

309 Ala Glu Gln Leu Arg Arg 

310 325 

313 <210> SEQ ID NO: 10 

314 <211> LENGTH: 328 

315 <212> TYPE: PRT 

316 <213> ORGANISM: Saccharomyces cerevisiae 
318 <400> SEQUENCE: 10 



320 Glu Ile Ser Ser Val Leu Ala Gly Gly Tyr Asn His Pro Leu Leu Arg 

321 1 5 10 15 

324 Gln Trp Gln Ser Glu Arg Gln Leu Thr Lys Asn Met Leu Ile Phe Pro 

325 20 25 30 

328 Leu Phe Ile Ser Asp Asn Pro Asp Asp Phe Thr Glu Ile Asp Ser Leu 

329 35 40 45 

332 Pro Asn Ile Asn Arg Ile Gly Val Asn Arg Leu Lys Asp Tyr Leu Lys 

333 50 55 60 

336 Pro Leu Val Ala Lys Gly Leu Arg Ser Val Ile Leu Phe Gly Val Pro 

337 65 70 75 80 

340 Leu Ile Pro Gly Thr Lys Asp Pro Val Gly Thr Ala Ala Asp Asp Pro 

341 85 90 95 

344 Ala Gly Pro Val Ile Gln Gly Ile Lys Phe Ile Arg Glu Tyr Phe Pro 

345 100 105 110 

348 Glu Leu Tyr Ile Ile Cys Asp Val Cys Leu Cys Glu Tyr Thr Ser His 

349 115 120 125 

352 Gly His Cys Gly Val Leu Tyr Asp Asp Gly Thr Ile Asn Arg Glu Arg 

353 130 135 140 

356 Ser Val Ser Arg Leu Ala Ala Val Ala Val Asn Tyr Ala Lys Ala Gly 

357 145 150 155 160 

360 Ala His Cys Val Ala Pro Ser Asp Met Ile Asp Gly Arg Ile Arg Asp 

361 165 170 175 

364 Ile Lys Arg Gly Leu Ile Asn Ala Asn Leu Ala His Lys Thr Phe Val 

365 180 185 190 

368 Leu Ser Tyr Ala Ala Lys Phe Ser Gly Asn Leu Tyr Gly Pro Phe Arg 
369 195 200 205 

372 Asp Ala Ala Cys Ser Ala Pro Ser Asn Gly Asp Arg Lys Cys Tyr Gln 

373 210 215 220 

376 Leu Pro Pro Ala Gly Arg Gly Leu Ala Arg Arg Ala Leu Glu Arg Asp 

377 225 230 235 240 

380 Met Ser Glu Gly Ala Asp Gly Ile Ile Val Lys Pro Ser Thr Phe Tyr 

381 245 250 255 

384 Leu Asp Ile Met Arg Asp Ala Ser Glu Ile Cys Lys Asp Leu Pro Ile 

385 260 265 270 

388 Cys Ala Tyr His Val Ser Gly Glu Tyr Ala Met Leu His Ala Ala Ala 

389 275 280 285 

392 Glu Lys Gly Val Val Asp Leu Lys Thr Ile Ala Phe Glu Ser His Gln 

393 290 295 300 

396 Gly Phe Leu Arg Ala Gly Ala Arg Leu Ile Ile Thr Tyr Leu Ala Pro 

397 305 310 315 320 

400 Glu Phe Leu Asp Trp Leu Asp Glu 

401 325 

404 <210> SEQ ID NO: 11 

405 <211> LENGTH: 215 

406 <212> TYPE: PRT 

407 <213> ORGANISM: Homo sapiens 
409 <400> SEQUENCE: 11 

411 Ala Gly His Asp Asn Thr Lys Pro Asp Thr Ser Ser Ser Leu Leu Thr 

412 1 5 10 15 

415 Ser Leu Asn Gln Leu Gly Glu Arg Gln Leu Leu Ser Val Val Lys Trp 

416 20 25 30 

419 Ser Lys Ser Leu Pro Gly Phe Arg Asn Leu His Ile Asp Asp Gln Ile 

420 35 40 45 

423 Thr Leu Ile Gln Tyr Ser Trp Met Ser Leu Met Val Phe Gly Leu Gly 

424 50 55 60 

427 Trp Arg Ser Tyr Lys His Val Ser Gly Gln Met Leu Tyr Phe Ala Pro 

428 65 70 75 80 

431 Asp Leu Ile Leu Asn Glu Gln Arg Met Lys Glu Ser Ser Phe Tyr Ser 

432 85 90 95 

435 Leu Cys Leu Thr Met Trp Gln Ile Pro Gln Glu Phe Val Lys Leu Gln 